Abstract

In recent years there has been interest in the theory of local computation over probabilistic Bayesian 
graphical models. In this paper, local computation over Bayes linear belief networks is shown to be 
amenable to a similar approach. However, the linear structure offers many simplifications and advantages 
relative to more complex models, and these are examined with reference to some illustrative examples. 
1 Introduction

Conditional independence graphs are of vital importance in the structuring, understanding and computing of
high dimensional complex statistical models. For a review of early work in this area, see [17], the references
and the discussion, and also [?]. The above mentioned work is concerned with updating in discrete probability
networks. For a discussion of updating in networks with continuous random variables, see [15], for example.
For a general overview of the theory of graphical models, see [?].

Also relevant to this paper is the work on graphical Gaussian models. [16], [?] and [?] discuss the
properties of such models. [18] examine data propagation through a graphical Gaussian network, and apply
their results to a dynamic linear model (DLM). Here, the aim is to link the theory of local computation over
graphical Gaussian networks to the Bayes linear framework for subjective statistical inference and, in
particular, to the many interpretive and diagnostic features associated with that methodology.

2 Bayes linear methods 
2.1 Overview 

In this paper, a Bayes linear approach is taken to subjective statistical inference, making expectation (rather
than probability) primitive. An overview of the methodology is given in [?]. The foundations of the theory
are quite general, and are outlined in the context of second-order exchangeability in [?], and discussed for
more general situations in [?]. Bayes linear methods may be used in order to learn about any quantities
of interest, provided only that a mean and variance specification is made for all relevant quantities, and a
specification for the covariance between all pairs of quantities is made. No distributional assumptions are
necessary. There are many interpretive and diagnostic features of the Bayes linear methodology. These are
discussed with reference to [B/D] (the Bayes linear computer programming language) in [?].



2.2 Bayes linear conditional independence 

Conventional graphical models are defined via strict probabilistic conditional independence [?]. However, as
[?] demonstrates, all that is actually required is a ternary operator $\cdot \perp\!\!\!\perp \cdot \,/\, \cdot$ satisfying some simple properties.
Any relation satisfying these properties is known as a generalised conditional independence relation. Bayes
linear graphical models are based on what [?] refers to as weak conditional independence. In this paper, the
relation will be referred to as adjusted orthogonality, in order to emphasise the linear structure underlying
the relation.

Bayes linear graphical models based upon the concept of adjusted orthogonality are described in [?]. For
completeness, and to introduce some notation useful in the context of local computation, the most important 
elements of the methodology are summarised here, and the precise form of the adjusted orthogonality relation 
is defined. 

For vectors of random quantities, X and Y, define $\mathrm{Cov}(X, Y) = \mathrm{E}(XY^{T}) - \mathrm{E}(X)\,\mathrm{E}(Y)^{T}$ and $\mathrm{Var}(X) = \mathrm{Cov}(X, X)$.
Also, for any matrix $A$, $A^{\dagger}$ represents the Moore-Penrose generalised inverse of $A$.

Definition 1 For all vectors of random quantities B and D, define 

$$P^{B}_{[D]} = \mathrm{Cov}(B, D)\,\mathrm{Var}(D)^{\dagger} \tag{1}$$

$$T^{B}_{[D]} = P^{B}_{[D]}\,P^{D}_{[B]} \tag{2}$$

These represent the fundamental operators of the Bayes linear methodology. $P^{B}_{[D]}$ is the operator which
updates the expectation vector for B based on the observation of D, and $T^{B}_{[D]}$ updates the variance matrix
for B based on observation of D. Local computation over Bayes linear graphical models is made possible by
local computation of these operators.
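As a minimal sketch of Definition 1, the two operators can be written with NumPy's Moore-Penrose inverse. The function names `projection` and `transform`, and the small second-order specification used to exercise them, are illustrative assumptions rather than part of the paper.

```python
import numpy as np


def projection(cov_bd, var_d):
    """P^{B}_{[D]} = Cov(B, D) Var(D)^+, as in equation (1)."""
    return cov_bd @ np.linalg.pinv(var_d)


def transform(cov_bd, var_b, var_d):
    """T^{B}_{[D]} = P^{B}_{[D]} P^{D}_{[B]}, as in equation (2)."""
    return projection(cov_bd, var_d) @ projection(cov_bd.T, var_b)


# Hypothetical specification: B has two components, D has one.
var_b = np.array([[400.0, 0.0], [0.0, 9.0]])
var_d = np.array([[571.0]])
cov_bd = np.array([[400.0], [0.0]])

P = projection(cov_bd, var_d)        # 2 x 1 matrix
T = transform(cov_bd, var_b, var_d)  # 2 x 2 matrix
print(P.ravel(), np.trace(T))
```

Using `np.linalg.pinv` rather than `np.linalg.inv` keeps the sketch valid when a variance matrix is singular, which is the situation the generalised inverse is meant to cover.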

Definition 2 For all vectors of random quantities B, C and D, define 

$$\mathrm{E}_{D}(B) = \mathrm{E}(B) + P^{B}_{[D]}\,[D - \mathrm{E}(D)] \tag{3}$$

$$\mathrm{Cov}_{D}(B, C) = \mathrm{Cov}(B - \mathrm{E}_{D}(B),\; C - \mathrm{E}_{D}(C)) \tag{4}$$

$\mathrm{E}_{D}(B)$ is the expectation for B adjusted by D. It represents the linear combination of a constant and the
components of D closest to B in the sense of expected squared loss. It corresponds to $\mathrm{E}(B \mid D)$ when B
and D are jointly multivariate normal. $\mathrm{Cov}_{D}(B, C)$ is the covariance between B and C adjusted by D, and
represents the covariance between B and C given observation of D. It corresponds to $\mathrm{Cov}(B, C \mid D)$ when
B, C and D are jointly multivariate normal.

Lemma 1 For all vectors of random quantities B, C and D 

$$\mathrm{Cov}_{D}(B, C) = \mathrm{Cov}(B, C) - \mathrm{Cov}(B, D)\,(P^{C}_{[D]})^{T} \tag{5}$$

$$\mathrm{Var}_{D}(B) = (I - T^{B}_{[D]})\,\mathrm{Var}(B) \tag{6}$$

Proof 

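Substituting (3) into (4), and noting that $\mathrm{E}_{D}(B)$ is a linear function of D, which is uncorrelated with
$C - \mathrm{E}_{D}(C)$, we get

$$\begin{aligned}
\mathrm{Cov}_{D}(B, C) &= \mathrm{Cov}(B,\; C - \mathrm{E}_{D}(C)) && (7) \\
&= \mathrm{Cov}(B,\; C - P^{C}_{[D]} D) && (8)
\end{aligned}$$

which gives (5), and replacing C by B gives (6), since $\mathrm{Cov}(B, D)\,(P^{B}_{[D]})^{T} = T^{B}_{[D]}\,\mathrm{Var}(B)$. □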
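Lemma 1 can be checked numerically. The sketch below builds an arbitrary joint variance matrix for (B, C, D), expands definition (4) by bilinearity of covariance, and compares the result with the closed forms (5) and (6). The dimensions, the random seed and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.standard_normal((7, 7))
s = m @ m.T  # joint variance of (B, C, D) with sizes 2, 2 and 3
b, c, d = slice(0, 2), slice(2, 4), slice(4, 7)
pinv = np.linalg.pinv

p_b = s[b, d] @ pinv(s[d, d])        # P^{B}_{[D]}
p_c = s[c, d] @ pinv(s[d, d])        # P^{C}_{[D]}
t_b = p_b @ s[d, b] @ pinv(s[b, b])  # T^{B}_{[D]} = P^{B}_{[D]} P^{D}_{[B]}

# Definition (4): Cov(B - P_B D, C - P_C D), expanded term by term.
cov_def = s[b, c] - s[b, d] @ p_c.T - p_b @ s[d, c] + p_b @ s[d, d] @ p_c.T
cov_lemma = s[b, c] - s[b, d] @ p_c.T      # equation (5)

var_def = s[b, b] - s[b, d] @ p_b.T - p_b @ s[d, b] + p_b @ s[d, d] @ p_b.T
var_lemma = (np.eye(2) - t_b) @ s[b, b]    # equation (6)

print(np.allclose(cov_def, cov_lemma), np.allclose(var_def, var_lemma))
```

The constant terms in the adjusted expectations drop out of every covariance, which is why only the projections appear in the expansion.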

Note that (6) shows that $T^{B}_{[D]}$ is responsible for the updating of variance matrices. Adjusted orthogonality
is now defined.

Definition 3 For random vectors B, C and D

$$B \perp\!\!\!\perp C \,/\, D \iff \mathrm{Cov}_{D}(B, C) = 0 \tag{9}$$

[?] shows that this relation does indeed define a generalised conditional independence property, and hence
that all the usual properties of graphical models based upon such a relation hold. 
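A small sketch of how adjusted orthogonality might be tested numerically, using the adjusted covariance in the form Cov(B, C) - Cov(B, D) Var(D)^+ Cov(D, C) given by Lemma 1. The helper names, the tolerance and the toy specification (B = D + e1, C = 2D + e2 with uncorrelated unit-variance terms) are assumptions for illustration.

```python
import numpy as np


def adjusted_cov(cov_bc, cov_bd, cov_cd, var_d):
    """Cov_D(B, C) = Cov(B, C) - Cov(B, D) (P^{C}_{[D]})^T."""
    p_c = cov_cd @ np.linalg.pinv(var_d)
    return cov_bc - cov_bd @ p_c.T


def adjusted_orthogonal(cov_bc, cov_bd, cov_cd, var_d, tol=1e-10):
    """True when every entry of Cov_D(B, C) is zero up to the tolerance."""
    return bool(np.all(np.abs(adjusted_cov(cov_bc, cov_bd, cov_cd, var_d)) < tol))


# Hypothetical quantities: B = D + e1 and C = 2 D + e2, Var(D) = 1.
var_d = np.array([[1.0]])
cov_bd = np.array([[1.0]])
cov_cd = np.array([[2.0]])
cov_bc = np.array([[2.0]])

separated = adjusted_orthogonal(cov_bc, cov_bd, cov_cd, var_d)
marginally_uncorrelated = bool(np.all(np.abs(cov_bc) < 1e-10))
print(separated, marginally_uncorrelated)
```

Here B and C are correlated a priori, but the correlation is carried entirely by D, so the pair is adjusted orthogonal given D.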
2.3 Bayes linear graphical models

[?] defines a Bayes linear influence diagram based upon the adjusted orthogonality relation. [12] illustrate
the use of Bayes linear influence diagrams in a multivariate forecasting problem. Relevant graph theoretic
concepts can be found in the appendix of [?]. The terms moral graph and junction tree are explained in
[?]. Briefly, an undirected moral graph is formed from a directed acyclic graph by marrying all pairs of
parents of each node, by adding an arc between them, and then dropping arrows from all arcs. A junction
tree is the tree of cliques of a triangulated moral graph. A tree is a graph without any cycles. A graph is
triangulated if no cycle of length at least four is without a chord. A clique is a maximal complete subset
of nodes of a triangulated graph, that is, a set of mutually adjacent nodes that cannot be enlarged.
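The moralisation step described above can be sketched in a few lines of plain Python. The representation of the directed graph as a dictionary from each node to its list of parents, and the function name `moral_graph`, are assumptions made for illustration.

```python
from itertools import combinations


def moral_graph(parents):
    """Return the undirected edge set of the moral graph of a DAG.

    `parents` maps each node to the list of its parent nodes.
    """
    edges = set()
    for child, pars in parents.items():
        for p in pars:
            edges.add(frozenset((p, child)))  # drop the arrow on each arc
        for p, q in combinations(pars, 2):
            edges.add(frozenset((p, q)))      # marry each pair of parents
    return edges


# Hypothetical DAG: A -> C <- B, and C -> D.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
edges = moral_graph(dag)
print(sorted(tuple(sorted(e)) for e in edges))
```

In this example the only added arc is the one joining A and B, the two parents of C.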

In this paper, attention will focus on undirected graphs. An undirected graph consists of a collection of 
nodes $B = \{B_i \mid 1 \le i \le n\}$ for some n, together with a collection of undirected arcs. Every pair of nodes,
$\{B_i, B_j\}$, is joined by an undirected arc unless $B_i \perp\!\!\!\perp B_j \,/\, B \setminus \{B_i, B_j\}$. Here, the standard set theory notation,
$B \setminus A$, is used to mean the set of elements of B which are not in A. An undirected graph may be obtained
from a Bayes linear influence diagram by forming the moral graph of the influence diagram in the usual way. 

In fact, local computation (the computation of global influences of particular nodes of the graph, using 
only information local to adjacent nodes) requires that the undirected graph representing the conditional 
independence structure is a tree. This tree may be formed as the junction tree of a triangulated moral graph, 
or better, by grouping together related variables "by hand" in order to get a tree structure for the graph. 
For the rest of this paper, it will be assumed that the model of interest is represented by an undirected tree 
defined via adjusted orthogonality. 

3 Local computation on Bayes linear graphical models 

3.1 Transforms for adjusted orthogonal belief structures 
Lemma 2 If B, C and D are random vectors such that $B \perp\!\!\!\perp C \,/\, D$, then

$$\mathrm{Cov}(B, C) = \mathrm{Cov}(B, D)\,(P^{C}_{[D]})^{T} \tag{10}$$

This follows immediately from Definition 3 and (5).

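Lemma 3 If X, Y and Z are random vectors such that $X \perp\!\!\!\perp Z \,/\, Y$, then

$$\mathrm{Cov}_{X}(Y, Z) = (I - T^{Y}_{[X]})\,\mathrm{Cov}(Y, Z) \tag{11}$$

Proof

From (5)

$$\begin{aligned}
\mathrm{Cov}_{X}(Y, Z) &= \mathrm{Cov}(Y, Z) - \mathrm{Cov}(Y, X)\,\mathrm{Var}(X)^{\dagger}\,\mathrm{Cov}(X, Z) && (12) \\
&= \mathrm{Cov}(Y, Z) - P^{Y}_{[X]}\,\mathrm{Cov}(X, Y)\,(P^{Z}_{[Y]})^{T} \quad \text{by Lemma 2} && (13)
\end{aligned}$$

and the result follows, since $\mathrm{Cov}(X, Y)\,(P^{Z}_{[Y]})^{T} = P^{X}_{[Y]}\,\mathrm{Cov}(Y, Z)$ and $P^{Y}_{[X]}\,P^{X}_{[Y]} = T^{Y}_{[X]}$. □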



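Theorem 1 If X, Y and Z are random vectors such that $X \perp\!\!\!\perp Z \,/\, Y$, then

$$P^{Z}_{[X]} = P^{Z}_{[Y]}\,P^{Y}_{[X]} \tag{14}$$

$$T^{Z}_{[X]} = P^{Z}_{[Y]}\,T^{Y}_{[X]}\,P^{Y}_{[Z]} \tag{15}$$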



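Proof

$$\begin{aligned}
P^{Z}_{[X]} &= \mathrm{Cov}(Z, X)\,\mathrm{Var}(X)^{\dagger} && (16) \\
&= \mathrm{Cov}(Z, Y)\,(P^{X}_{[Y]})^{T}\,\mathrm{Var}(X)^{\dagger} \quad \text{by Lemma 2} && (17)
\end{aligned}$$

which gives (14). Also

$$\begin{aligned}
T^{Z}_{[X]} &= P^{Z}_{[X]}\,P^{X}_{[Z]} && (18) \\
&= \mathrm{Cov}(Z, X)\,\mathrm{Var}(X)^{\dagger}\,\mathrm{Cov}(X, Z)\,\mathrm{Var}(Z)^{\dagger} && (19) \\
&= \mathrm{Cov}(Z, Y)\,(P^{X}_{[Y]})^{T}\,\mathrm{Var}(X)^{\dagger}\,\mathrm{Cov}(X, Y)\,(P^{Z}_{[Y]})^{T}\,\mathrm{Var}(Z)^{\dagger} && (20)
\end{aligned}$$

which gives (15). □

Figure 1: Local computation along a path (nodes X, Y and Z, each labelled with its variance matrix and with its transforms T and P for the adjustment by X)

Theorem 1 contains the two key results which allow local computation over Bayes linear belief networks.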
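The two identities of Theorem 1, read here as P^Z_[X] = P^Z_[Y] P^Y_[X] and T^Z_[X] = P^Z_[Y] T^Y_[X] P^Y_[Z], can be checked numerically on a chain in which X and Z are adjusted orthogonal given Y. The construction below (Y = A X + noise, Z = C Y + noise, with uncorrelated noise terms), the dimensions and the seed are assumptions chosen only to produce such a chain.

```python
import numpy as np

rng = np.random.default_rng(1)
pinv = np.linalg.pinv


def spd(k):
    """A random symmetric positive definite k x k matrix."""
    m = rng.standard_normal((k, k))
    return m @ m.T + np.eye(k)


def proj(cov_bd, var_d):
    """P^{B}_{[D]} = Cov(B, D) Var(D)^+."""
    return cov_bd @ pinv(var_d)


# Hypothetical chain X -> Y -> Z with linear links and uncorrelated noise.
a = rng.standard_normal((3, 2))
c = rng.standard_normal((2, 3))
var_x = spd(2)
var_y = a @ var_x @ a.T + spd(3)
var_z = c @ var_y @ c.T + spd(2)
cov_yx = a @ var_x
cov_zy = c @ var_y
cov_zx = c @ cov_yx

# Direct ("global") computation of the operators for Z adjusted by X.
p_zx_direct = proj(cov_zx, var_x)
t_zx_direct = p_zx_direct @ proj(cov_zx.T, var_z)

# Local computation through the intermediate node Y.
p_yx = proj(cov_yx, var_x)
t_yx = p_yx @ proj(cov_yx.T, var_y)
p_zy, p_yz = proj(cov_zy, var_y), proj(cov_zy.T, var_z)
p_zx_local = p_zy @ p_yx
t_zx_local = p_zy @ t_yx @ p_yz

print(np.allclose(p_zx_direct, p_zx_local), np.allclose(t_zx_direct, t_zx_local))
```

The local route never touches Cov(Z, X), which is the point of the theorem: only information attached to Y, Z and the arc between them is needed once the operators for Y are known.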

3.2 Local computation on trees 

The implications of Theorem 1 for Bayes linear trees should be clear from examination of Figure 1. To
examine the effect of observing node X, it is sufficient to compute the operators $P^{Z}_{[X]}$ and $T^{Z}_{[X]}$ for every
node, Z, on the graph, since these operators contain all necessary information about the adjustment of Z by
X. There is a unique path from X to Z, which is shown in Figure 1. The direct predecessor of Z is denoted
by Y. Note further that it is a property of the graph that $X \perp\!\!\!\perp Z \,/\, Y$. Further, by Theorem 1, the transforms
$P^{Z}_{[X]}$ and $T^{Z}_{[X]}$ can be computed using $P^{Y}_{[X]}$ and $T^{Y}_{[X]}$ together with information local to nodes Y and Z.
This provides a recursive method for the calculation of the transforms, which leads to the algorithm for the
propagation of transforms throughout the tree, which is described in the next section.

3.3 Algorithm for transform propagation 

Consider a tree with nodes $B = \{B_1, \ldots, B_n\}$ for some n. Each node, $B_i$, represents a vector of random
quantities. It also has an edge set G, where each $g \in G$ is of the form $g = \{B_k, B_l\}$ for some k, l. The resulting
tree should represent a conditional independence graph over the random variables in question. It is assumed
that each node, $B_i$, has an expectation vector $E_{B(i)} = \mathrm{E}(B_i)$ and variance matrix $V_{B(i)} = \mathrm{Var}(B_i)$ associated
with it. It is further assumed that each edge, $\{B_k, B_l\}$, has the covariance matrix $C_{B(k),B(l)} = \mathrm{Cov}(B_k, B_l)$
associated with it. This is the only information required in order to carry out Bayes linear local computation
over such structures.

Now consider the effect of adjustment by the vector X, which consists of some or all of the components
of node $B_j$ for some j. Then, starting with node $B_j$, calculate and store $T_{B(j)} = T^{B_j}_{[X]}$ and $P_{B(j)} = P^{B_j}_{[X]}$.
Then, for each node $B_k \in b(B_j) = \{B_i \mid \{B_i, B_j\} \in G\}$ calculate and store $T_{B(k)}$ and $P_{B(k)}$, then for each
node $B_l \in b(B_k) \setminus B_j$, do the same, using Theorem 1 to calculate

$$P_{B(l)} = P^{B_l}_{[B_k]}\,P_{B(k)} \tag{21}$$

$$T_{B(l)} = P^{B_l}_{[B_k]}\,T_{B(k)}\,P^{B_k}_{[B_l]} \tag{22}$$

In this way, recursively step outward through the tree, at each stage computing and storing the transforms
using the transforms from the predecessor and the variance and covariance information over and between
the current node and its predecessor.

Once this process is completed, associated with every node, $B_i \in B$, there are matrices $T_{B(i)} = T^{B_i}_{[X]}$
and $P_{B(i)} = P^{B_i}_{[X]}$. These operators represent all information about the adjustment of the structure by X.
Note however, that X has not yet been observed, and that expectations, variances and covariances associated
with nodes and edges have not been updated.
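The outward sweep can be sketched as a traversal that applies the two recursions of Theorem 1 at each arc. Everything about the data layout here is an assumption: nodes are string labels, `var` maps a node to its variance matrix, and `cov[(l, k)]` holds Cov(B_l, B_k) with both orientations stored. The four-node tree uses moments computed from one reading of the Section 4 specification (the entries of W in particular are assumed), and only the transform for X2 is checked.

```python
import numpy as np

pinv = np.linalg.pinv


def propagate(neighbours, var, cov, start, p_start, t_start):
    """Propagate P and T for adjustment by X outward from node `start`."""
    P, T = {start: p_start}, {start: t_start}
    stack = [(start, None)]
    while stack:
        k, parent = stack.pop()
        for l in neighbours[k]:
            if l == parent:
                continue
            p_lk = cov[(l, k)] @ pinv(var[k])  # P^{B_l}_{[B_k]}
            p_kl = cov[(k, l)] @ pinv(var[l])  # P^{B_k}_{[B_l]}
            P[l] = p_lk @ P[k]                 # recursion for projections
            T[l] = p_lk @ T[k] @ p_kl          # recursion for transforms
            stack.append((l, k))
    return P, T


# Illustrative tree X1 - th1 - th2 - X2 with assumed DLM-style moments.
G = np.array([[1.0, 1.0], [0.0, 1.0]])
F = np.array([[1.0, 0.0]])
W = np.diag([4.75, 0.36])
U = np.array([[171.0]])
v1 = np.diag([400.0, 9.0])
v2 = G @ v1 @ G.T + W

var = {"th1": v1, "th2": v2, "X1": F @ v1 @ F.T + U, "X2": F @ v2 @ F.T + U}
cov = {("X1", "th1"): F @ v1, ("th2", "th1"): G @ v1, ("X2", "th2"): F @ v2}
cov.update({(k, l): c.T for (l, k), c in list(cov.items())})
neighbours = {"X1": ["th1"], "th1": ["X1", "th2"],
              "th2": ["th1", "X2"], "X2": ["th2"]}

# Adjusting by the whole of node X1, so both starting operators are identities.
P, T = propagate(neighbours, var, cov, "X1", np.eye(1), np.eye(1))
print(np.round(T["X2"], 3))  # [[0.479]]
```

An explicit stack is used instead of recursion so that deep chains do not hit Python's recursion limit; branches could equally be processed in parallel, since each depends only on its predecessor.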

It is a crucial part of the Bayes linear methodology that a priori analysis of the model takes place, and 
that the expected influence of potential observables is examined. Examination of the eigen structure of the 
belief transforms associated with nodes of particular interest is the key to understanding the structure of the 
model, and the benefits of observing particular nodes. It is important from a design perspective that such 
analyses can take place before any observations are made. See [?] for a more complete discussion of such
issues, and [14] for a discussion of the technical issues it raises.

3.4 Updating of expectation and covariance structures after observation 

After observation of X = x, updating of the expectation, variance and covariance structure over the tree is 
required. Start at node Bj and calculate 

$$E'_{B(j)} = E_{B(j)} + P_{B(j)}\,(x - \mathrm{E}(X)) \tag{23}$$

$$V'_{B(j)} = (I - T_{B(j)})\,V_{B(j)} \tag{24}$$

(using (6)). Replace $E_{B(j)}$ by $E'_{B(j)}$ and $V_{B(j)}$ by $V'_{B(j)}$. Then for each $B_k \in b(B_j)$ do the same, and also
update the arc between $B_j$ and $B_k$ by calculating

$$C'_{B(j),B(k)} = (I - T_{B(j)})\,C_{B(j),B(k)} \tag{25}$$

(using Lemma 3), and replacing $C_{B(j),B(k)}$ by $C'_{B(j),B(k)}$. Again, step outwards through the tree, updating
nodes and edges using the transforms previously calculated.
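A minimal sketch of the node and arc updates, assuming the stored operators P and T for the node are already available. The function names and the numerical values (a two-component node with prior mean (20, 0), variance diag(400, 9), and an observation x = 17 of a scalar X with prior mean 20 and variance 571) are illustrative assumptions.

```python
import numpy as np


def update_node(e, v, p, t, x, e_x):
    """Return the updated expectation and variance for one node."""
    e_new = e + p @ (x - e_x)            # expectation update
    v_new = (np.eye(len(v)) - t) @ v     # variance update
    return e_new, v_new


def update_arc(c_jk, t_j):
    """Update Cov(B_j, B_k), where B_j is the node nearer to X."""
    return (np.eye(len(t_j)) - t_j) @ c_jk


e_th1 = np.array([20.0, 0.0])
v_th1 = np.diag([400.0, 9.0])
p_th1 = np.array([[400.0 / 571.0], [0.0]])
t_th1 = np.array([[400.0 / 571.0, 0.0], [0.0, 0.0]])
# Assumed arc covariance Cov(th1, th2) = Var(th1) G^T for a linear link G.
c_12 = v_th1 @ np.array([[1.0, 1.0], [0.0, 1.0]]).T

e_new, v_new = update_node(e_th1, v_th1, p_th1, t_th1,
                           np.array([17.0]), np.array([20.0]))
c_new = update_arc(c_12, t_th1)
print(np.round(e_new, 3), np.round(v_new, 2), np.round(c_new, 2), sep="\n")
```

Because the covariance stored on an arc has a direction, the transform applied on the left must belong to the node that indexes the rows of the stored matrix.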

3.5 Pruning the tree 

Once the expectations, variances and covariances over the structure have been updated, the tree should be 
pruned. If the adjusting node was completely observed (i.e. X = Bj), then Bj should be removed from 
B, and G should have any arcs involving Bj removed. Further, leaf nodes and their edges may always be 
dropped without affecting the conditional independence structure of the graph. This is important if a leaf 
node is partially observed and the remaining variables are unobservable and of little diagnostic interest, since 
it means that the whole node may be dropped after observation of its observable components. 

If a non-leaf node is partially observed, or a leaf node is observed, but its remaining components are 
observable or of interest, then the graph itself should remain unaffected, but the expectation, variance and 
covariance matrices associated with the node and its arcs should have the observed (and hence redundant) 
rows and columns removed (for reasons of efficiency — use of the Moore-Penrose generalised inverse ensures 
that no problems will arise if observed variables are left in the system). 

3.6 Sequential adjustment 

As data becomes available on various nodes, it should be incorporated into the tree one node at a time. 
For each node with observations, the transforms should be computed, and then the beliefs updated in a 
sequential fashion. The fact that such sequential updating provides a coherent method of adjustment is 
demonstrated in [?].

3.7 Local computation of diagnostics 

Diagnostics for Bayes linear adjustments are a crucial part of the methodology, and are discussed in [?]. It
follows that for local computation over Bayes linear networks to be of practical value, methodology must be 
developed for the local computation of Bayes linear diagnostics such as the size, expected size and bearing of 
an adjustment. The bearing represents the magnitude and direction of changes in belief. The magnitude of 
the bearing, which indicates the magnitude of changes in belief, is known as the size of the adjustment. 

Consider the observation of data, X = x, and the partial bearing of the adjustment it induces on some
node, $B_p$. Before observation of X, record $E = E_{B(p)}$ and $V = V_{B(p)}$. Also calculate the Cholesky factor, A,
of V, so that A is lower triangular, and $V = AA^{T}$. Once the observed value X = x is known, propagate the
revised expectations, variances and covariances through the Bayes linear tree. The new value of $E_{B(p)}$ will
be denoted $E'_{B(p)}$. Now the quantity

$$E' = A^{\dagger}(E'_{B(p)} - E) \tag{26}$$

represents the adjusted expectation for an orthonormal basis, $F = A^{\dagger}(B_p - E)$, for $B_p$ with respect to the a
priori beliefs, E and V. Therefore, E' gives the coordinates of the bearing of the adjustment with respect
to that basis.

The size of the partial adjustment is given by

$$\mathrm{Size}_{x}(B_p) = \|E'\|^{2} \tag{27}$$

where $\|\cdot\|$ represents the Euclidean norm. The expected size is given by

$$\mathrm{E}(\mathrm{Size}_{X}(B_p)) = \mathrm{Tr}(T_{B(p)}) \tag{28}$$

and so the size ratio for the adjustment (often of most immediate interest) is given by

$$\mathrm{Sr}_{x}(B_p) = \frac{\|E'\|^{2}}{\mathrm{Tr}(T_{B(p)})} \tag{29}$$

A size ratio close to one indicates changes in belief close to what would be expected. A size ratio smaller 
than one indicates changes in belief of smaller magnitude than would have been anticipated a priori, and 
a size ratio bigger than one indicates changes in belief of larger magnitude than would have been expected 
a priori. Informally, a size ratio bigger than 3 is often taken to indicate a diagnostic warning of possible 
conflict between a priori belief specifications and the observed data. 
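The diagnostics above can be sketched as follows, assuming the prior variance matrix is positive definite so that `np.linalg.cholesky` applies. The function name and the numerical inputs (a node with prior mean (20, 0) and variance diag(400, 9), adjusted by a scalar X with prior mean 20 and variance 571 observed at 17) are illustrative assumptions.

```python
import numpy as np


def size_diagnostics(e_prior, v_prior, e_adjusted, transform):
    """Return the bearing coordinates, the size and the size ratio."""
    a = np.linalg.cholesky(v_prior)                       # V = A A^T, A lower triangular
    bearing = np.linalg.pinv(a) @ (e_adjusted - e_prior)  # coordinates E'
    size = float(bearing @ bearing)                       # squared Euclidean norm
    expected_size = float(np.trace(transform))            # trace of the transform
    return bearing, size, size / expected_size


e_prior = np.array([20.0, 0.0])
v_prior = np.diag([400.0, 9.0])
p = np.array([[400.0 / 571.0], [0.0]])
t = np.array([[400.0 / 571.0, 0.0], [0.0, 0.0]])
e_adjusted = e_prior + p @ (np.array([17.0]) - np.array([20.0]))

bearing, size, ratio = size_diagnostics(e_prior, v_prior, e_adjusted, t)
print(round(ratio, 3))  # 0.016
```

For a scalar observation the ratio reduces to (x - E(X))^2 / Var(X), here 9/571, whichever node is examined, so a value well below one simply says the observation landed close to its prior mean.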

Cumulative sizes and bearings may be calculated in exactly the same way, simply by updating several
times before computing E'. However, to calculate the expected size of the adjustment, in order to compute
the size ratio, the cumulative belief transform must be recorded and updated at each stage, using the fact
that

$$T^{B}_{[X+Y]} = I - (I - T^{B}_{[Y/X]})(I - T^{B}_{[X]}) \tag{30}$$

where $T^{B}_{[Y/X]}$ represents the partial transform for B by Y, with respect to the structure already adjusted by
X. In other words, I minus the transforms at each stage multiply together to give I minus the cumulative
transform. See [?] for a more complete discussion of the multiplicative properties of belief transforms.
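The multiplicative property can be motivated by applying the variance update twice. In the sketch below, $T^{B}_{[X]}$ is the transform for B by X, $T^{B}_{[Y/X]}$ is the partial transform by Y computed on the structure already adjusted by X, and $T^{B}_{[X+Y]}$ is the cumulative transform; this notation, and the ordering of the factors, are assumptions made for the illustration.

```latex
\begin{align*}
\mathrm{Var}_{X}(B) &= (I - T^{B}_{[X]})\,\mathrm{Var}(B), \\
\mathrm{Var}_{X+Y}(B) &= (I - T^{B}_{[Y/X]})\,\mathrm{Var}_{X}(B)
  = (I - T^{B}_{[Y/X]})(I - T^{B}_{[X]})\,\mathrm{Var}(B), \\
\mathrm{Var}_{X+Y}(B) &= (I - T^{B}_{[X+Y]})\,\mathrm{Var}(B).
\end{align*}
```

Comparing the last two lines suggests $I - T^{B}_{[X+Y]} = (I - T^{B}_{[Y/X]})(I - T^{B}_{[X]})$, at least when $\mathrm{Var}(B)$ is invertible, so the expected cumulative size is the trace of the left-hand transform.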

3.8 Efficient computation for evidence from multiple nodes 

To adjust the tree given data at multiple nodes, it would be inefficient to adjust the entire tree sequentially 
by each node in turn, if the nodes in question are "close together" . Here again, ideas may be borrowed from 
theory for the updating of probabilistic expert systems. It is possible to propagate transforms and projections 
from each node for which adjustment is required, to a strong root, and then propagate transforms from the 
strong root out to the rest of the tree. A strong root is a node which separates the nodes for which there is 
information, from as much as possible of the rest of the tree. In practice, there are many ways in which one 
can use the strong root in order to control information flow through the tree. An example of its use is given 
in Section 5.

3.9 Geometric interpretation, and infinite collections 

In this paper, attention has focussed exclusively on finite collections of quantities, and matrix representations 
for Bayes linear operators. All of the theory has been developed from the perspective of pushing matrix 
representations of linear operators around a network. However, the Bayes linear methodology may be 
formulated and developed from a purely geometric viewpoint, involving linear operators on a (possibly infinite 
dimensional) Hilbert space. This is not relevant to practical computer implementations of the theory and 
algorithms - hence the focus on matrix formulations in this paper. However, from a conceptual viewpoint, 
it is very important, since one sometimes has to deal, in principle, with infinite collections of quantities, or 
probability measures over an infinite partition. In fact, all of the theory for local computation over Bayes 
linear belief networks developed in this paper is valid for the local computation of Bayes linear operators 
on an arbitrary Hilbert space. Consequently, the results may be interpreted geometrically, as providing 
a method of pushing linear operators around a Bayes linear Hilbert space network. A geometric form of 
Theorem 1 is derived and utilised in [13].
Figure 2: Tree for a dynamic linear model

4 Example: A dynamic linear model 

Figure 2 shows a Bayes linear graphical tree model for the first four time points of a dynamic linear model.
Local computation will be illustrated using the example model, beliefs and data from [?]. Here, $\forall t$, $\theta_t$
represents the vector $(M_t, N_t)^{T}$ from that paper. The model takes the form

$$X_t = (1, 0)\,\theta_t + \nu_t \tag{31}$$

$$\theta_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \theta_{t-1} + \omega_t \tag{32}$$

where $\mathrm{Var}(\theta_1) = \mathrm{diag}(400, 9)$, $\mathrm{E}(\theta_1) = (20, 0)^{T}$, $\mathrm{E}(\nu_t) = 0$, $\mathrm{E}(\omega_t) = 0$, $\mathrm{Var}(\nu_t) = 171$, $\mathrm{Var}(\omega_t) =
\mathrm{diag}(4.75, 0.36)$ $\forall t$, and the $\nu_t$ and $\omega_t$ are uncorrelated.

First, the nodes and arcs shown in Figure 2 are defined. Then the expectation and variance of each
node are calculated and associated with the node, and the covariance between each pair of nodes joined by an
arc is also computed, and associated with the arc. All of the expectations, variances and covariances are
determined by the model. For example, node $X_1$ has expectation vector (20) and variance matrix (571)
associated with it. Node $\theta_1$ has expectation vector $(20, 0)^{T}$ and variance matrix diag(400, 9) associated with
it. The arc between $X_1$ and $\theta_1$ has associated with it the covariance matrix (400, 0). Note that though the
arc is undirected, the "direction" with respect to which the covariance matrix is defined is important, and
needs also to be stored.
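As a worked check of the quoted figures, the moments for $X_1$ follow directly from the observation equation $X_1 = (1, 0)\,\theta_1 + \nu_1$, under the assumption (not stated explicitly above) that $\nu_1$ is uncorrelated with $\theta_1$:

```latex
\begin{align*}
\mathrm{E}(X_1) &= (1, 0)\,\mathrm{E}(\theta_1) = 20, \\
\mathrm{Var}(X_1) &= (1, 0)\,\mathrm{Var}(\theta_1)\,(1, 0)^{T} + \mathrm{Var}(\nu_1) = 400 + 171 = 571, \\
\mathrm{Cov}(X_1, \theta_1) &= (1, 0)\,\mathrm{Var}(\theta_1) = (400, 0).
\end{align*}
```

The covariance is written here with $X_1$ indexing the row; stored the other way round it would be the column vector $(400, 0)^{T}$, which is why the orientation of each arc needs to be recorded.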

The effect of the observation of $X_1$ on the tree structure will be examined, and the effect on the node
$\theta_4$ in particular, which has a priori variance matrix ( ... ) associated with it. Before actual
observation of $X_1$, the belief transforms for the adjustment may be computed across the tree structure. The
transforms are computed recursively, in the following order.



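$$T_{X(1)} = (1), \quad T_{\theta(1)} = \begin{pmatrix} 0.7 & 0 \\ 0 & 0 \end{pmatrix}, \quad T_{\theta(2)} = \begin{pmatrix} 0.692 & -0.692 \\ 0 & 0 \end{pmatrix} \tag{33}$$

$$T_{\theta(3)} = \begin{pmatrix} 0.684 & -1.342 \\ 0 & 0 \end{pmatrix}, \quad T_{\theta(4)} = \begin{pmatrix} 0.674 & -1.949 \\ 0 & 0 \end{pmatrix} \tag{34}$$

$$T_{X(2)} = (0.479), \quad T_{X(3)} = (0.453), \quad T_{X(4)} = (0.417) \tag{35}$$

The P matrices are calculated similarly. In particular, $P_{\theta(1)} = (0.7, 0)^{T}$. A priori analysis of the belief
transforms is possible. For example, $\mathrm{Tr}(T_{\theta(4)}) = 0.674$, indicating that observation of $X_1$ is expected to
reduce overall uncertainty about $\theta_4$ by a factor of 0.674. This is also the expected size of the bearing for the
adjustment of $\theta_4$ by $X_1$.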
Figure 3: Graphical models for 3-step exchangeable quantities



Now, $X_1$ is observed to be 17, and so the expectations, variances and covariances may be updated



size ratio is 0.016. 

Once evidence from the observed value of $X_1$ has been taken into account, the $X_1$ node, and the arc
between $X_1$ and $\theta_1$, may be dropped from the graph. Note also that $\theta_1$ then becomes an unobservable leaf
node, which may be of little interest, and so if desired, the $\theta_1$ node, and the arc between $\theta_1$ and $\theta_2$, may also
be dropped. Observation of $X_2$ may now be considered. Using the updated, pruned tree, projections and
transforms for the adjustment by $X_2$ may be calculated and propagated through the tree. For example, the
(partial) belief transform for the adjustment of $\theta_4$ by $X_2$ is ( ... ). If cumulative diagnostics are
of interest, then it is necessary to calculate the cumulative belief transform,
$\begin{pmatrix} \cdots & \cdots \\ 0.01 & 0.00 \end{pmatrix}$. This has
trace 0.82, and so the resolution for the combined adjustment of $\theta_4$ by $X_1$ and $X_2$ is 0.82. Similarly, the
expected size of the cumulative bearing is 0.82. $X_2$ is observed to be 22. The new expectations, variances
and covariances may then be propagated through the tree. For example, the new expectation vector for $\theta_4$ is
$(19.95, 0.13)^{T}$. The size of the cumulative bearing is 0.002, giving a size ratio of approximately 0.002. Again,
the tree may be pruned, and the whole process may continue.
5 Example: n-step exchangeable adjustments



An ordered collection of random quantities, $\{X_1, X_2, \ldots\}$, is said to be (second-order) n-step exchangeable
if (second-order) beliefs about the collection remain invariant under an arbitrary translation or reflection of
the collection, and if the covariance between any two members of the collection is fixed, provided only that
they are a distance of at least n apart. Such quantities arise naturally in the context of differenced time
series [?]. n-step exchangeable quantities may be written in the form

$$X_i = M + R_i, \quad \forall i \tag{36}$$

where the Ri are a mean zero n-step exchangeable collection such that the covariance between them is zero 
provided they are a distance of at least n apart. M represents the underlying mean for the collection, and Ri 
represents the residual uncertainty which would be left if the underlying mean became known. Introduction 
of a mean quantity helps to simplify a graphical model for an n-step exchangeable collection. For example, 
Figure 3 (top) shows an undirected graphical model for a 3-step exchangeable collection. Note that without
the introduction of the mean quantity, M, all nodes on the graph would be joined, not just those a distance
of one and two apart. Figure 3 (bottom) shows a conditional independence graph for the same collection of
quantities, duplicated and grouped together so as to make the resulting graph a tree. Note that each node
contains 3 quantities, and that there is one less node than observables.

In general, for a collection of N n-step exchangeable quantities, the variables can be grouped together
to obtain a simple chain graph, in the obvious way, so that there are $N - n + 2$ nodes, each containing n
quantities. The resulting graph for 5-step exchangeable quantities is shown in Figure 4 (with the first four
nodes missing).
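One way to read "the obvious way" is that each node holds the mean quantity M together with n - 1 consecutive observables, with neighbouring nodes overlapping in all but one observable. The sketch below follows that reading; the function name and the string labels are assumptions made for illustration.

```python
def chain_nodes(num_observables, n):
    """Group X_1..X_N and M into a chain of nodes with n quantities each.

    Node i holds M together with X_i, ..., X_{i+n-2}.
    """
    if n < 2 or num_observables < n - 1:
        raise ValueError("need n >= 2 and at least n - 1 observables")
    return [["M"] + [f"X{i + k}" for k in range(n - 1)]
            for i in range(1, num_observables - n + 3)]


nodes = chain_nodes(5, 3)
for node in nodes:
    print(node)
```

For five observables and n = 3 this gives four nodes of three quantities each, one fewer node than observables, in line with the counts stated above.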

In [?], 3-, 4- and 5-step exchangeable collections, $\{X^{(1)2}_t\}$, $\{X^{(2)2}_t\}$ and $\{X^{(3)2}_t\}$, are used in order to
learn about the quantities $V_1$, $V_2$ and $V_3$, representing the variances underlying the DLM discussed in the
previous section. Since the observable sequences are 3-, 4- and 5-step exchangeable, they may all be regarded
as 5-step exchangeable, and so Figure 4 represents a graphical model for the variables, where
$V = (V_1, V_2, V_3)^{T}$ and, $\forall t$, $Z_t = (X^{(1)2}_t, X^{(2)2}_t, X^{(3)2}_t)^{T}$. Note that V represents (a known
linear function of) the mean of the 5-step exchangeable vectors, $Z_t$. Each node of the graph actually contains
15 quantities. For example, the first node shown contains
$\{V_1, V_2, V_3, X^{(1)2}_5, X^{(2)2}_5, X^{(3)2}_5, \ldots, X^{(1)2}_8, X^{(2)2}_8, X^{(3)2}_8\}$. Note that the fact that quantities are duplicated in other nodes
does not affect the analysis in any way. Observation of a particular quantity in one node will reduce to zero
the variance of that quantity in any other node, as they will have a correlation of unity. Here, the fact that
the Moore-Penrose generalised inverse is used in the definition of the projections and transforms becomes
important. Updating for this model may be locally computed over this tree structure in the usual way.

Suppose now that information for quantities $Z_5$ to $Z_9$ is to become available simultaneously. This
corresponds to information on the first two nodes (and others, but this will be conveyed automatically). The 
second node acts as a strong root for information from the first two nodes. The transform for the first node 
may be calculated using information on the first node, thus allowing computation of the transform for the 
second node given information on the first. Once the information from the first node has been incorporated 
into the first two nodes, the transform for the second node given information from the first two nodes may 
be calculated, and the resulting transform for the second node given information on the first two may be 
used in order to propagate information to the rest of the tree. 



6 Implementation considerations 

A test system for model building and computation over Bayes linear belief networks has been developed 
by the author using the MuPAD computer algebra system, described in [?] and [?]. MuPAD is a very
high level object-oriented mathematical programming language, with symbolic computing capabilities, ideal 
for the rapid prototyping of mathematical software and algorithms. The test system allows definition of 
nodes and arcs of a tree, and attachment of relevant beliefs to nodes and arcs. Recursive algorithms allow 
computation of belief transforms for node adjustment, and propagation of updated means, variances and 
covariances through the tree. Note that whilst propagating outwards through the tree, updating of the 
different branches of the tree may proceed in parallel. MuPAD provides a "parallel for" construct which 
allows simple exploitation of this fact on appropriate hardware. Simple functions to allow computation of 
diagnostics for particular nodes also exist. 
Figure 4: A graphical model for 5-step exchangeable vectors

7 Conclusions 

The algorithms described in this paper are very simple and easy to implement, and very fast compared to 
many other algorithms for updating in Bayesian belief networks. Further, by linking the theory with the 
machinery of the Bayes linear methodology, full a priori and diagnostic analysis may also take place. A 
priori analysis is particularly important in large sparse networks, where it is often not clear whether or 
not it is worth observing particular nodes, which may be "far" from nodes of interest. Similarly, diagnostic 
analysis is crucial, both for diagnosing misspecified node and arc beliefs, and for diagnosing an incorrectly 
structured model. 

For those who already appreciate the benefits of working within the Bayes linear paradigm, the method- 
ology described in this paper provides a mechanism for the tackling of much larger structured problems than 
previously possible, using local computation of belief transforms, adjustments and diagnostics. 